Agent-Scars

The problem

Most AI agents loop because they have no memory of failure. An agent that tries the same syntactically invalid code three times in a row isn't dumb: it has forgotten the first two tries. The context window resets, the previous error scrolls out of view, and the agent confidently produces the exact same broken output.

In Agentic OS running in production, this showed up as the code-generation agent producing identical syntax errors across 3 to 5 consecutive turns. The fix wasn't "use a better model" - it was giving the agent persistent failure memory so it could see what it had already tried and failed at.

How it works: step by step

The system uses a three-beat pattern: Record → Detect → Guard. Each beat is a single function call.

Step 1: Record the incident. When an agent fails (a syntax error, a rate limit, a timeout, or a bad API call), call scar.recordIncident(type, agent, provider, statusCode, message). This logs the failure to SQLite (with full ACID guarantees) or to a JSON fallback file if better-sqlite3 isn't installed. Each incident records the error type, which agent failed, which LLM provider was used, the HTTP status code, and the raw error message. Incidents are keyed by workspace and project, enabling multi-tenant tracking.
Step 2: Detect repeating patterns. Before the next turn, you call scar.detectFailurePatterns(scars). This scans recent failures against a bank of 8 built-in regex patterns (syntax errors, ESM/CJS conflicts, rate limits, path traversal, etc.) and counts how many times each pattern has fired. If a pattern repeats 2 or more times, it's flagged with a specific human-readable fix instruction. You can extend the pattern bank by adding entries to the knownFixes array in the constructor.
Step 3: Inject the guard block. Call scar.injectRepeatGuard(prompt, scars). If patterns were detected, a structured warning block is prepended to the agent's system prompt, so the LLM reads it before generating its response. The block lists each repeated error, how many times it occurred, and the specific fix instruction, giving the agent what it needs to recognize the pattern and change its approach.
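
The detection step above can be sketched as a regex bank plus a counter. This is a minimal illustration, not SCAR's actual source: the two regexes and the entry fields (name, regex, hint) are assumptions, while the threshold of 2 and the result fields (pattern, count, hint) follow the description in this document.

```javascript
// Hypothetical two-entry pattern bank. The real knownFixes array may use
// different field names and different regexes.
const knownFixes = [
  {
    name: 'SYNTAX COMPLIANCE ERROR',
    regex: /unexpected token|unexpected end of input|syntaxerror/i,
    hint: 'Output must be valid syntax. Verify braces, brackets, parentheses, and commas.',
  },
  {
    name: 'API RATE LIMIT TRIGGER',
    regex: /\b429\b|rate limit/i,
    hint: 'Back off and switch providers.',
  },
];

// Count how many recent incidents match each pattern and keep only the
// patterns that reached the repeat threshold.
function detectRepeats(incidents, bank = knownFixes, threshold = 2) {
  const flagged = [];
  for (const fix of bank) {
    const count = incidents.filter((incident) => fix.regex.test(incident.message)).length;
    if (count >= threshold) {
      flagged.push({ pattern: fix.name, count, hint: fix.hint });
    }
  }
  return flagged;
}

const incidents = [
  { type: 'syntax_error', message: 'Unexpected token }' },
  { type: 'syntax_error', message: 'SyntaxError: Unexpected end of input' },
  { type: 'rate_limit', message: 'HTTP 429 Too Many Requests' },
];

// Only the syntax pattern is flagged: it matched twice, the rate limit once.
console.log(detectRepeats(incidents));
```

The regexes deliberately omit the g flag, because RegExp.prototype.test on a global regex keeps state in lastIndex and would give inconsistent counts across calls.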

The guard block: what the agent actually sees

### !!! REPEAT FAILURE GUARD - DO NOT REPEAT THESE ERRORS !!!
The following errors have occurred MULTIPLE times in this session.
Prioritize fixing them:

▶ Pattern: SYNTAX COMPLIANCE ERROR (Failed 2 times)
  Fix: Output must be valid syntax. Verify braces, brackets,
       parentheses, and commas. Never truncate output mid-block.

▶ Pattern: ESM IMPORT IN COMMONJS (Failed 2 times)
  Fix: File runs in CommonJS. Use require() instead of import.

### !!! END REPEAT FAILURE GUARD !!!

The agent reads this before it sees the user's actual request. It's equivalent to a senior engineer tapping the junior on the shoulder and saying "you already tried that twice and it broke both times - here's what to do differently."
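
As a rough sketch of how such a block could be assembled and prepended, the snippet below renders detected patterns into the layout of the sample above. The function names renderGuard and prependGuard are hypothetical, and the exact whitespace SCAR emits may differ.

```javascript
// Illustrative renderer: the layout copies the sample guard block,
// but this is not SCAR's own implementation.
function renderGuard(patterns) {
  if (patterns.length === 0) return '';
  const entries = patterns.map(
    (p) => `▶ Pattern: ${p.pattern} (Failed ${p.count} times)\n  Fix: ${p.hint}`
  );
  return [
    '### !!! REPEAT FAILURE GUARD - DO NOT REPEAT THESE ERRORS !!!',
    'The following errors have occurred MULTIPLE times in this session.',
    'Prioritize fixing them:',
    '',
    entries.join('\n\n'),
    '',
    '### !!! END REPEAT FAILURE GUARD !!!',
  ].join('\n');
}

// Prepend the guard so the model reads it before the original prompt.
// With no detected patterns the prompt is returned unchanged.
function prependGuard(prompt, patterns) {
  const guard = renderGuard(patterns);
  return guard ? `${guard}\n\n${prompt}` : prompt;
}

const detected = [
  { pattern: 'ESM IMPORT IN COMMONJS', count: 2, hint: 'File runs in CommonJS. Use require() instead of import.' },
];

console.log(prependGuard('Write the Express route file.', detected));
```

Returning the prompt untouched when nothing repeats keeps the guard out of the way on healthy turns, so it costs no tokens until a pattern has actually fired.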

Built-in failure patterns

SCAR ships with regex matchers for the 8 most common agent mistakes in production:

SYNTAX COMPLIANCE ERROR. Invalid JavaScript or Python syntax: unclosed braces, missing commas, truncated output blocks.
ESM IMPORT IN COMMONJS. The agent uses import in a file that runs as CommonJS. Fix: use require().
ROUTE FACTORY EXPORT. Express route files must export a factory function, not a raw router.
BANNED CALL EXHAUSTION. eval() and exec() are forbidden in the codebase.
PATH TRAVERSAL ATTEMPT. No .. or absolute paths allowed in file operations.
INVALID SEARCH/REPLACE DIFF. Search blocks in diffs must match the target file exactly.
API RATE LIMIT TRIGGER. Provider returned 429: back off and switch providers.
REQUEST TIMEOUT EXPIRED. LLM inference took too long: reduce prompt size or switch models.

Storage: SQLite with JSON fallback

If better-sqlite3 is installed, incidents go to ./data/scars.db with full ACID guarantees, WAL mode, and indexed queries. If the native module isn't available (e.g. in a sandboxed CI environment), SCAR automatically falls back to ./data/incidents.json - same API, same behavior, just file-based persistence. You'll see a single log line:

[SCAR] SQLite unavailable. Falling back to JSON incident logs.
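
The fallback can be pictured roughly as follows. This is a simplified sketch under several assumptions: the openIncidentStore name, the table schema, and the record/recent methods are hypothetical, indexes are omitted, and the loader is made injectable only so the JSON path can be triggered on purpose. The better-sqlite3 calls used (the Database constructor, pragma, exec, prepare, run, all) are part of that library's documented API.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical store factory: try SQLite first, fall back to a JSON file.
// Note that any SQLite error, not just a missing module, triggers the fallback.
function openIncidentStore(dataDir, loadSqlite = () => require('better-sqlite3')) {
  fs.mkdirSync(dataDir, { recursive: true });
  try {
    const Database = loadSqlite();
    const db = new Database(path.join(dataDir, 'scars.db'));
    db.pragma('journal_mode = WAL');
    db.exec('CREATE TABLE IF NOT EXISTS incidents (type TEXT, message TEXT, created_at INTEGER)');
    const insert = db.prepare('INSERT INTO incidents (type, message, created_at) VALUES (?, ?, ?)');
    const select = db.prepare('SELECT type, message FROM incidents ORDER BY rowid DESC LIMIT ?');
    return {
      backend: 'sqlite',
      record: (type, message) => { insert.run(type, message, Date.now()); },
      recent: (limit) => select.all(limit),
    };
  } catch (err) {
    console.warn('[SCAR] SQLite unavailable. Falling back to JSON incident logs.');
    const file = path.join(dataDir, 'incidents.json');
    const read = () => (fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, 'utf8')) : []);
    return {
      backend: 'json',
      record: (type, message) => {
        const all = read();
        all.push({ type, message });
        fs.writeFileSync(file, JSON.stringify(all, null, 2));
      },
      // Newest first, to match the ORDER BY in the SQLite branch.
      recent: (limit) => read().slice(-limit).reverse(),
    };
  }
}
```

Both branches return the same two methods, which is what lets calling code stay unaware of the backend in use.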

How to run it

git clone https://github.com/shubham0086/Agent-Scars
cd Agent-Scars
npm install

# See the guard block appear after the 3rd failure
npm run demo:guard

# See incidents persist across separate sessions
npm run demo:persist

No API key needed. Runs fully offline in mock mode.

The API

import { SCAR } from 'agent-scars';

// 1. Create instance (multi-tenant by workspace + project)
const scar = new SCAR('my-workspace', 'my-project');

// 2. Record a failure
scar.recordIncident('syntax_error', 'coder', 'openai', 400, 'Unexpected token }');

// 3. Get recent failures
const scars = scar.getRecentScars(10);

// 4. Detect patterns
const patterns = scar.detectFailurePatterns(scars);
// → once an error has repeated, e.g. [{ pattern: 'SYNTAX COMPLIANCE ERROR', count: 2, hint: '...' }]

// 5. Inject guard into prompt
const guarded = scar.injectRepeatGuard(originalPrompt, scars);
// → prepends warning block if patterns found
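
To show where these calls might sit in an agent loop, here is a hedged sketch that runs without the package installed. StubScar is a hypothetical in-memory stand-in that mimics only the method names from the listing above, callAgent and fakeAgent are invented for illustration, and the loop is synchronous for brevity where a real LLM call would be awaited.

```javascript
// Hypothetical stand-in for SCAR: same method names, trivial in-memory logic.
class StubScar {
  constructor() {
    this.incidents = [];
  }
  recordIncident(type, agent, provider, statusCode, message) {
    this.incidents.push({ type, agent, provider, statusCode, message });
  }
  getRecentScars(limit) {
    return this.incidents.slice(-limit);
  }
  injectRepeatGuard(prompt, scars) {
    const repeated = scars.filter((s) => s.type === 'syntax_error').length >= 2;
    return repeated ? `REPEAT FAILURE GUARD: fix syntax errors\n\n${prompt}` : prompt;
  }
}

// Guard the prompt before each turn and record every failure after it.
// callAgent is assumed to return { ok: true, output } or { ok: false, error }.
function runWithGuard(scar, callAgent, prompt, maxTurns = 5) {
  for (let turn = 1; turn <= maxTurns; turn += 1) {
    const guarded = scar.injectRepeatGuard(prompt, scar.getRecentScars(10));
    const result = callAgent(guarded);
    if (result.ok) return { turn, output: result.output };
    const { type, statusCode, message } = result.error;
    scar.recordIncident(type, 'coder', 'mock', statusCode, message);
  }
  return { turn: maxTurns, output: null };
}

// Fake agent: keeps emitting broken code until the guard appears in its prompt.
const fakeAgent = (agentPrompt) =>
  agentPrompt.includes('REPEAT FAILURE GUARD')
    ? { ok: true, output: 'const config = { retries: 3 };' }
    : { ok: false, error: { type: 'syntax_error', statusCode: 400, message: 'Unexpected token }' } };

const outcome = runWithGuard(new StubScar(), fakeAgent, 'Write a config object.');
console.log(outcome); // succeeds on turn 3, after two recorded failures
```

The first two turns fail and are recorded; on the third turn the guard is present and the fake agent changes its output, which is the behavior the three-beat pattern is meant to produce.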

Real-world use cases

Code generation agents that keep producing the same syntax errors across consecutive turns.
Research agents that retry failed API calls without backoff or provider switching.
Multi-step pipelines where one agent's mistake cascades into the next agent's failure input.
Long-running sessions where an agent drifts and repeats errors it made 20 turns ago.

Where this fits

Agent-Scars is the memory + consistency proof from the autonomy ladder. It implements Pattern 03 (Reality-First Memory) and Pattern 07 (Anti-Drift) from Agentic Patterns. The full production engine that uses SCAR in its pipeline is AgentKernel. For cross-session solution memory (the positive counterpart to failure memory), see Agent-Recall.

Honest framing

This is pattern detection via regex, not semantic understanding. It catches the 8 most common production failures reliably. It won't catch novel error categories unless you add them to knownFixes. The v2 roadmap includes LLM-assisted pattern extraction for unknown error types, but the regex approach has been production-stable for 18 months and handles the vast majority of real cases.
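
For a novel error category, a custom entry might look like the sketch below. The field names (name, regex, hint) and the MISSING MODULE pattern are assumptions for illustration; check the actual knownFixes array in the constructor for the real entry shape before adding one.

```javascript
// Hypothetical custom entry for an error type outside the built-in 8.
const missingModuleFix = {
  name: 'MISSING MODULE',
  regex: /cannot find module ['"]([^'"]+)['"]/i,
  hint: 'Import only packages listed in package.json, or add the dependency first.',
};

// Check the regex against a message it should catch and one it should ignore
// before relying on it in the pattern bank.
const sampleMessages = [
  "Error: Cannot find module 'lodash'",
  'Unexpected token }',
];

for (const message of sampleMessages) {
  console.log(`${message} => ${missingModuleFix.regex.test(message)}`);
}
```

Testing a new regex against both a matching and a non-matching message is a cheap way to avoid a pattern that fires on unrelated failures and injects a misleading fix.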

The Three-Beat System: Annotated Reference